Tag

#AI optimization

31 articles

Dutch company ASML is $300bn from a trillion. AI could close the gap

Learn how to simulate ASML's EUV lithography technology and AI optimization systems using Python. This beginner-friendly tutorial teaches you to create data simulations, visualize manufacturing metrics, and build simple AI models that could optimize chip production.

Jul 19 · 16

Microsoft is reportedly training salespeople to talk down OpenAI and Anthropic

This article explains how AI model efficiency optimization works and why it matters for enterprise AI competition, using Microsoft's strategy against OpenAI and Anthropic as a case study.

Jul 15 · 11

A Coding Guide to NVIDIA’s Tile-Based GPU Programming: From cuTile and Triton Kernels to Flash Attention

This article explains tile-based GPU programming concepts, focusing on NVIDIA's cuTile and Triton frameworks, and how they enable efficient Flash Attention in large language models.

Jul 11 · 33

tech

Meet Nemotron Labs 3 Puzzle 75B A9B: A Compressed Hybrid MoE LLM Delivering 2.03x Server Throughput

NVIDIA introduces Nemotron-Labs-3-Puzzle-75B-A9B, a compressed hybrid MoE LLM delivering 2.03x server throughput, leveraging hardware-aware compression and knowledge distillation.

Jul 9 · 34

tech

I've been reviewing laptops for years: These are the 15+ best July 4th laptop deals

This article explains how artificial intelligence and machine learning optimize retail pricing and promotional strategies, using laptop sales as an example of data-driven decision-making.

Jul 2 · 20

Meta's non-invasive brain-to-text AI is closing the gap with surgical implants

Meta's new non-invasive brain-to-text AI system, Brain2Qwerty v2, translates brain activity into typed sentences without requiring surgery. The technology is advancing rapidly, with AI optimization playing a key role.

Jul 1 · 36

OpenAI reportedly cut response costs for guest ChatGPT users by more than half

OpenAI has reportedly cut inference costs for its AI models by more than half, significantly reducing the number of GPUs needed to process ChatGPT responses.

Jun 30 · 38

DFlash Speculative Decoding Drafts Whole Token Blocks in Parallel for Up to 15x Higher Throughput on NVIDIA Blackwell

Researchers at UC San Diego introduce DFlash, a new speculative decoding technique that drafts whole token blocks in parallel, achieving up to 15x throughput improvement on NVIDIA Blackwell.

Jun 23 · 58

Cisco AI Introduces FAPO: Pipeline-Aware Prompt Optimization With Step-Level Failure Attribution and Claude Code Orchestration

Learn how FAPO, a new AI tool from Cisco, automatically improves AI prompts by analyzing each step of a task to make AI systems more accurate and reliable.

Jun 20 · 56

The KV Cache Compression Race: TurboQuant vs OSCAR vs EpiCache

As KV cache memory outpaces model weights in large language models, three compression techniques—TurboQuant, OSCAR, and EpiCache—are emerging as key contenders. While each offers distinct methods for optimization, they are seen as complementary rather than competitive.

Jun 18 · 55

Microsoft's SkillOpt boosts GPT-5.5 by using nothing but a trained Markdown file

Learn how to create and apply SkillOpt Markdown files to dramatically improve AI agent performance on procedural tasks, boosting models like GPT-5.5 by 23 points.

Jun 13 · 60

Building Reflective Prompt Optimization with GEPA: Multi-Component Prompts, Structured Feedback, and Held-Out Validation

Researchers introduce GEPA, a reflective prompt-evolution framework that enhances small language models' performance on multi-step arithmetic problems through structured feedback and multi-component prompt design.

Jun 7 · 52